This is my PM566 final project.
In this project, I used a dataset that records arrest incidents in the City of Los Angeles from 2010 to 2019. The dataset has 1,320,000 rows, and each row represents one arrest. The analysis asks whether the number of arrest incidents in the City of Los Angeles changed between 2010 and 2019, and whether there are any relationships among subject age, sex, area, time, and arrest type for arrests made in 2019. The graphs below focus on the main variables: subject age, subject sex, patrol division location, and the type of charge the individual was arrested for.
The bar chart shows the total number of arrest cases in each area in 2010 and 2019. The number of cases in each area of Los Angeles appears to be lower in 2019 than in 2010.
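The comparison behind the bar chart amounts to counting arrests per area within each of the two years. A minimal Python sketch of that count, assuming each record carries an area name and an arrest year; the field names `area` and `year` and the sample records are hypothetical placeholders, not values from the dataset:

```python
from collections import Counter

def count_by_area_and_year(records, years=(2010, 2019)):
    """Count arrests per (area, year), keeping only the years being compared."""
    counts = Counter()
    for rec in records:
        if rec["year"] in years:
            counts[(rec["area"], rec["year"])] += 1
    return counts

# Hypothetical placeholder records, not real arrest data.
sample = [
    {"area": "A", "year": 2010},
    {"area": "A", "year": 2010},
    {"area": "A", "year": 2019},
    {"area": "B", "year": 2015},  # ignored: outside the compared years
]
counts = count_by_area_and_year(sample)
```

Each `(area, year)` pair in the result would correspond to one bar in the chart.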
The scatter plot shows the number of cases and the average age of subjects. Point size indicates the number of cases: larger points mean more cases. Area is indicated by color.
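The two quantities plotted for each area, case count and average age, could be computed along these lines. This is only a sketch under assumed field names (`area`, `age`), which are hypothetical and may not match the dataset's actual columns:

```python
from collections import defaultdict

def cases_and_mean_age_by_area(records):
    """Return {area: (number_of_cases, mean_age)} for the given records."""
    ages = defaultdict(list)
    for rec in records:
        ages[rec["area"]].append(rec["age"])
    # Every area in `ages` has at least one age, so the division is safe.
    return {area: (len(vals), sum(vals) / len(vals)) for area, vals in ages.items()}
```

Each entry would map to one point: the count drives the point size and the mean age gives its position.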
The boxplot shows age by arrest type and sex more clearly. Males appear to be older than females for most arrest types. Subjects arrested under the dependent arrest type appear to be the youngest.
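The center line of each box is the median age within one arrest type and sex group. A small sketch of that grouping, assuming hypothetical field names `arrest_type`, `sex`, and `age`:

```python
from collections import defaultdict
from statistics import median

def median_age_by_type_and_sex(records):
    """Return {(arrest_type, sex): median_age} for the given records."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["arrest_type"], rec["sex"])].append(rec["age"])
    return {key: median(vals) for key, vals in groups.items()}
```

Comparing the medians for the two sexes within the same arrest type is one way to check the pattern the boxplot suggests.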
The histogram shows that the age ranges overlap and that, for most arrest types, subject ages are concentrated between 25 and 28.
The table above is an interactive data table that shows the total count of cases grouped by arrest date, arrest time, patrol division location, arrest type, and sex.
- The number of cases in each area of Los Angeles appears to be lower in 2019 than in 2010.
- Subject age appears to vary with arrest type, although there is some overlap.
- Males tend to be older than females.
- Subject age appears to differ across patrol divisions.
https://github.com/jiqingwu1997/PM566_Final/raw/main/report%20.pdf